AITopics

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Neural Information Processing SystemsFeb-10-2026, 09:59:04 GMT

a3048e47310d6efaa4b1eaf55227bc92-Paper.pdf

arxiv preprint arxiv, contrastive view, laplacian perturbation, (11 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsAug-16-2025, 13:20:51 GMT

a3048e47310d6efaa4b1eaf55227bc92-Supplemental.pdf

artificial intelligence, machine learning, pacing function, (15 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)

Neural Information Processing SystemsAug-16-2025, 13:20:47 GMT

a3048e47310d6efaa4b1eaf55227bc92-Paper.pdf

artificial intelligence, latexit sha1, machine learning, (12 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

arXiv.org Artificial IntelligenceOct-14-2024

Towards General Deepfake Detection with Dynamic Curriculum

Song, Wentang, Lin, Yuzhen, Li, Bin

Most previous deepfake detection methods bent their efforts to discriminate artifacts by end-to-end training. However, the learned networks often fail to mine the general face forgery information efficiently due to ignoring the data hardness. In this work, we propose to introduce the sample hardness into the training of deepfake detectors via the curriculum learning paradigm. Specifically, we present a novel simple yet effective strategy, named Dynamic Facial Forensic Curriculum (DFFC), which makes the model gradually focus on hard samples during the training. Firstly, we propose Dynamic Forensic Hardness (DFH) which integrates the facial quality score and instantaneous instance loss to dynamically measure sample hardness during the training. Furthermore, we present a pacing function to control the data subsets from easy to hard throughout the training process based on DFH. Comprehensive experiments show that DFFC can improve both within- and cross-dataset performance of various kinds of end-to-end deepfake detectors through a plug-and-play approach. It indicates that DFFC can help deepfake detectors learn general forgery discriminative features by effectively exploiting the information from hard samples.

artificial intelligence, computer vision, machine learning, (17 more...)

doi: 10.1109/ICASSP48485.2024.10448345

2410.11162

Country: Asia > China > Guangdong Province > Shenzhen (0.05)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Khan, Muhammad Asif, Menouar, Hamid, Hamila, Ridha

Curriculum for Crowd Counting -- Is it Worthy?

arXiv.org Artificial IntelligenceJan-15-2024

Recent advances in deep learning techniques have achieved remarkable performance in several computer vision problems. A notably intuitive technique called Curriculum Learning (CL) has been introduced recently for training deep learning models. Surprisingly, curriculum learning achieves significantly improved results in some tasks but marginal or no improvement in others. Hence, there is still a debate about its adoption as a standard method to train supervised learning models. In this work, we investigate the impact of curriculum learning in crowd counting using the density estimation method. We performed detailed investigations by conducting 112 experiments using six different CL settings using eight different crowd models. Our experiments show that curriculum learning improves the model learning performance and shortens the convergence time.

crowd counting, curriculum, pacing function, (12 more...)

2401.07586

Country:

Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.40)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Martinez, Richard Diehl, Goriely, Zebulon, McGovern, Hope, Davis, Christopher, Caines, Andrew, Buttery, Paula, Beinborn, Lisa

CLIMB: Curriculum Learning for Infant-inspired Model Building

arXiv.org Artificial IntelligenceNov-15-2023

We describe our team's contribution to the STRICT-SMALL track of the BabyLM Challenge. The challenge requires training a language model from scratch using only a relatively small training dataset of ten million words. We experiment with three variants of cognitively-motivated curriculum learning and analyze their effect on the performance of the model on linguistic evaluation tasks. In the vocabulary curriculum, we analyze methods for constraining the vocabulary in the early stages of training to simulate cognitively more plausible learning curves. In the data curriculum experiments, we vary the order of the training instances based on i) infant-inspired expectations and ii) the learning behavior of the model. In the objective curriculum, we explore different variations of combining the conventional masked language modeling task with a more coarse-grained word class prediction task to reinforce linguistic generalization capabilities. Our results did not yield consistent improvements over our own non-curriculum learning baseline across a range of linguistic benchmarks; however, we do find marginal gains on select tasks. Our analysis highlights key takeaways for specific combinations of tasks and settings which benefit from our proposed curricula. We moreover determine that careful selection of model architecture, and training hyper-parameters yield substantial improvements over the default baselines provided by the BabyLM challenge.

computational linguistic, curriculum, vanilla model, (15 more...)

2311.08886

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
(8 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.34)

Khan, Muhammad Asif, Hamila, Ridha, Menouar, Hamid

CLIP: Train Faster with Less Data

arXiv.org Artificial IntelligenceAug-2-2023

Deep learning models require an enormous amount of data for training. However, recently there is a shift in machine learning from model-centric to data-centric approaches. In data-centric approaches, the focus is to refine and improve the quality of the data to improve the learning performance of the models rather than redesigning model architectures. In this paper, we propose CLIP i.e., Curriculum Learning with Iterative data Pruning. CLIP combines two data-centric approaches i.e., curriculum learning and dataset pruning to improve the model learning accuracy and convergence speed. The proposed scheme applies loss-aware dataset pruning to iteratively remove the least significant samples and progressively reduces the size of the effective dataset in the curriculum learning training. Extensive experiments performed on crowd density estimation models validate the notion behind combining the two approaches by reducing the convergence time and improving generalization. To our knowledge, the idea of data pruning as an embedded process in curriculum learning is novel.

artificial intelligence, deep learning, machine learning, (14 more...)

2212.01452

Country:

Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

arXiv.org Artificial IntelligenceDec-24-2022

When Do Curricula Work in Federated Learning?

Vahidian, Saeed, Kadaveru, Sreevatsank, Baek, Woonjoon, Wang, Weijia, Kungurtsev, Vyacheslav, Chen, Chen, Shah, Mubarak, Lin, Bill

An oft-cited open problem of federated learning is the existence of data heterogeneity at the clients. One pathway to understanding the drastic accuracy drop in federated learning is by scrutinizing the behavior of the clients' deep models on data with different levels of "difficulty", which has been left unaddressed. In this paper, we investigate a different and rarely studied dimension of FL: ordered learning. Specifically, we aim to investigate how ordered learning principles can contribute to alleviating the heterogeneity effects in FL. We present theoretical analysis and conduct extensive empirical studies on the efficacy of orderings spanning three kinds of learning: curriculum, anti-curriculum, and random curriculum. We find that curriculum learning largely alleviates non-IIDness. Interestingly, the more disparate the data distributions across clients the more they benefit from ordered learning. We provide analysis explaining this phenomenon, specifically indicating how curriculum training appears to make the objective landscape progressively less convex, suggesting fast converging iterations at the beginning of the training procedure. We derive quantitative results of convergence for both convex and nonconvex objectives by modeling the curriculum training on federated devices as local SGD with locally biased stochastic gradients. Also, inspired by ordered learning, we propose a novel client selection technique that benefits from the real-world disparity in the clients. Our proposed approach to client selection has a synergic effect when applied together with ordered learning in FL.

artificial intelligence, curriculum, machine learning, (17 more...)

2212.12712

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Karakasidis, Georgios, Grósz, Tamás, Kurimo, Mikko

Comparison and Analysis of New Curriculum Criteria for End-to-End ASR

arXiv.org Artificial IntelligenceAug-10-2022

It is common knowledge that the quantity and quality of the training data play a significant role in the creation of a good machine learning model. In this paper, we take it one step further and demonstrate that the way the training examples are arranged is also of crucial importance. Curriculum Learning is built on the observation that organized and structured assimilation of knowledge has the ability to enable faster training and better comprehension. When humans learn to speak, they first try to utter basic phones and then gradually move towards more complex structures such as words and sentences. This methodology is known as Curriculum Learning, and we employ it in the context of Automatic Speech Recognition. We hypothesize that end-to-end models can achieve better performance when provided with an organized training set consisting of examples that exhibit an increasing level of difficulty (i.e. a curriculum). To impose structure on the training set and to define the notion of an easy example, we explored multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself. Empirical results show that with different curriculums we can balance the training times and the network's performance.

curriculum, epoch, pacing function, (13 more...)

2208.05782

Country:

Europe > Finland (0.05)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)